Data science is the discipline of making data useful. Ok…so what is it?
Engineering (infrastructure and production): the process of making everything else possible
Analysis: the process of turning raw information into insights in a fast way
Modeling/Inference: the process of diving deeper into the data to discover the pattern we don’t easily see
(It is a group work from https://github.com/brohrer/academic_advisory/blob/master/authors.md !)
Data environment: data storage, Kafka platform, Hadoop and Spark cluster etc.
Production: integrate model and analysis into the production system
Domain knowledge
Exploratory analysis
Story telling
Supervised learning
Unsupervised learning
Customized model development